Skip to content

Karknu/tx undecision#5352

Draft
karknu wants to merge 67 commits intomainfrom
karknu/tx_undecision
Draft

Karknu/tx undecision#5352
karknu wants to merge 67 commits intomainfrom
karknu/tx_undecision

Conversation

@karknu
Copy link
Copy Markdown
Contributor

@karknu karknu commented Apr 9, 2026

Description

Txsubmission V2, but without a central decision thread.
WIP. WIP. WIP.

Checklist

Quality

  • Commit sequence makes sense and have useful messages, see ref.
  • New tests are added and existing tests are updated.
  • Self-reviewed the PR.

Maintenance

  • Linked an issue or added the PR to the current sprint of ouroboros-network project.
  • Added labels.
  • Updated changelog files.
  • The documentation has been properly updated, see ref.

Attempt to create a buildable version of whats released in cardano-node
10.7.0.
@github-project-automation github-project-automation Bot moved this to In Progress in Ouroboros Network Apr 9, 2026
@karknu karknu force-pushed the karknu/tx_undecision branch 2 times, most recently from e3189b8 to 47f2ed7 Compare April 14, 2026 07:50
@karknu karknu force-pushed the karknu/tx_undecision branch from b747b36 to 441a997 Compare April 17, 2026 08:48
karknu added 23 commits April 27, 2026 10:13
In case of duplicate txids, a second invalid tx could be counted as
valid because it was tracked by txid alongside the valid duplicate,
conflating the two.
Add a test for ensuring that a peer can download all valid TXs when
faced with two peers with conflicting tx order.
Replace the central decsision thread by having server threads coordinate
between them by blocking on STM actions.
Use sharedGeneration to track if the sharedState really changed and only
write to the tvar if it changed.

This makes common operations like receiving and acking a txid that is
already retained something that only affects the peer local state.
Remove shared states that where only written to.
Have the peer update its score during phase change. This makes State's
idle calculation quicker since peers will drain back to 0.

Mark pacIdlePeerScores as lazy to avoid doing the calculations when it
isn't needed.
Add benchmarks for V2
Instead of traversing all peers only touch the peers that need to wake
up.
Avoid copying the map when encountering an existing advertiser.
We don't need to track the ack state, it can be derived by the time we
decide to ack a TX. So txAdvertisers can just be a Set member check.
Remove the tracking of advertisers in TxEntry.
This change also changing how scoring is used to rank peers.
A peers score affects how long time after the TX owners lease expire
they can wake up and attempt to claim it. This means that
acknowledgement/downloading requires minimal coordination between peers.
The first peer that advertised a new txid always got a lease on it.
This is a problem since the peer may be at capacity and unable to
request the TX.
Update the behaviour so that anyone peer that advertise a txid can gain
the first claim.
Replace the local state tvar with V1's Stateful types.
Improve comment around deletion of retaind txs.
The retained functions in Types.hs are just small wrappes around IntPSQ
functions.
Use nothunks to assert that there are no thunks after some property
based tests.
If there is at least one TX outstanding don't ack the final txid in the
window
We don't track peers that don't have an ongoing attempt any longer so
TxNoAttempt isn't needed.
karknu added 4 commits April 27, 2026 10:48
Inline some hot path functions. The old NOINLINE pragmas where a leftover
from V1, and not applicable to V2 since we don't run unsafeNoThunks
between states.
Not only idle peers but all peers should be notified when something
changes for one of their TX.s
When picking TXs to submit to the mempool stop at gaps.
When picking TXs to download use the peer's advertisement order as a
guide, not the TxKey order.
Test the peer score functionality.
Use arbitrary TxDecisionPolicy instead of the defaultTxDecisionPolicy.
@karknu karknu force-pushed the karknu/tx_undecision branch from 086f290 to cff5d5d Compare April 27, 2026 11:01
karknu added 25 commits April 28, 2026 10:09
When peers have conflicting advertisement orders for TXs it is possible
for them to form a dependancy loop.

For example peer A has tx id 1 but requires tx id 0 to submit, peer B has
txid 2 but requires tx id 1 to submit, peer C has txid 0 but requires
txid 2 before it can submit.

We break this loop by introducing a inflightTimeout. When a peer has
spent that much time unable to get the TX into the mempool it will bump
the new currentMaxInflightMultiplicity limit. This will allow another
peer to issue a new request for the TX.
Add tests that verify that V2 is robust even if peers are late or omit
TXs in replies.
Add prop_txSubmission_score_impairment which verifies TX submisssion
V2's scoring functionality.
Test nextPeerAction but running it through a series of triggers for a
small group of peers.
checkNoThunks was always true, unsafeNoThunks wasn't really called.
seq forces val to WHNF, but deeper thunks will be detected.
With no-thunk checks in prop_nextPeerAction_processesAllTriggers the
prop_nextPeerAction_prioritisesSubmit isn't needed.
Convert trivial nextPeerAction_claimsRejectedTxFromOtherAdvertiser
to a unit tests.
Fold prop_nextPeerAction_claimsExpiredLease into
prop_nextPeerAction_claimsClaimableTx which is expanded
to cover both case which is expanded
to cover both casess
Convert the direct server benchmark into one that can run multiple peers
with async.
Avoid Double conversions, similar to muxs diffTimeToMicroseconds.
Use IntSet for faster lookups for retained keys.
IntPSQ is still needed for quick removal of expeired entries.
Reduce STM contention in V2 TxSubmission inbound by splitting per-peer
in-flight bookkeeping out of SharedTxState into a small per-peer
PeerTxInFlight TVar. SharedTxState is now only written when the shared
state updates. The common case of a new peer advertising an existing
txid is just a read operation for the shared state and a write operation
into the peer local TVar.
Break out mempoolGetSnapshot into its own atomic transaction.
Add the conter thread to the test cases.
Add meta tests for AppV2 generators/shrinkers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

1 participant